Abstract



We investigate the limiting behaviour of random tree growth in preferential attachment models. 
The tree stems from a root, and we add vertices to the system one-by-one at random, according to 
a rule which depends on the degree distribution of the already existing tree. The so-called weight 
function, in terms of which the rule of attachment is formulated, is such that each vertex in the tree 
can have at most K children. 

We define the concept of a certain random measure $\mu$ on the leaves of the limiting tree, which
captures a global property of the tree growth in a natural way. We prove that the Hausdorff and the
packing dimensions of this limiting measure are equal and constant with probability one. Moreover, the
local dimension of $\mu$ equals the Hausdorff dimension at $\mu$-almost every point. We give an explicit
formula for the dimension, given the rule of attachment.

1 Introduction

We investigate a family of tree growth models in which the tree stems from a root in the beginning,
and vertices are added one at a time, the new vertex always attaching to exactly one already existing 
vertex. The rule by which the new vertex chooses its parent depends on the degree distribution
of the tree at the time the vertex is born. The models can be either in discrete time,
when a vertex is born every second, or in continuous time, when birth times are random. For the
problems we discuss, these two versions are equivalent and can be translated into each other (details
in Section 2.1). The classical models and results of the area use the discrete setting. However, for the
proofs we give, the continuous-time version is much more natural and convenient, so this is what we 
will use. 

This large family of models includes, for example, the Barabási-Albert graph [1], in which the linear
preferential attachment rule reproduces certain phenomena observed in real-world networks (e.g. the
power law decay of the degree sequence). This property of the Barabási-Albert graph was proved in
a mathematically precise way in [5] and, independently, in [?]. A wider class of models is considered
in [15]; for rigorous results on different cases of this model, see [19], [?].

The results mentioned above focus on the local behaviour of the random tree, namely, they give 
results concerning the neighbourhood of a uniformly random vertex, which is chosen from the tree 
after a long time of tree evolution. In this paper we concentrate on global properties of the limiting 
tree. 






It is natural to pose the following question. Let us fix a vertex, say the first vertex in the first 
generation, just above the root. What is the "limiting success level" of this vertex, compared to the 
other vertices in the same generation? What we mean by this is the number of descendants of this 
vertex, after a long time of tree evolution, compared to the number of descendants of its brothers. 

Another formulation of the same question is to fix a vertex, let the tree grow for a long time,
then choose a vertex uniformly at random from the big tree, and ask for the probability that this random
vertex is a descendant of the fixed vertex. Clearly, if we look at these limiting probabilities for, say,
the first generation, we get a distribution, itself random, that encodes important information
about the evolution of the tree.

If one looks at the system of these limiting (as time evolution of the tree tends to infinity) random 
distributions on the different generations of the tree, it is tempting to ask something about the limiting 
measure of this system, when letting the generation level tend to infinity. We will define the above
concepts properly, and will denote this overall limiting measure by $\mu$.

Having a random measure in our hand, which describes a global property of the limiting infinite 
system, it is natural to ask about the Hausdorff (and packing) dimension of this measure, for several 
reasons. First, these are the primary quantities capturing the scaling behaviour of the system, so they 
appear in statistical and Statistical Physics discussions. Secondly, these can actually be measured 
in (finite, but big) real systems, so they can be used to check the validity of models, or to tune 
parameters. 

On the other hand, the dimension of the measure depends on a parameter of the underlying metric, 
which is arbitrary. To rule out this (trivial) dependence, it is usual to ask about the entropy of the 
limiting measure, which depends on the growth process only. This is the natural equivalent of the 
dimension from a dynamical point of view. 

We prove the following results. 

1. The limiting entropies (as time tends to infinity) of the random measures on the different
generations converge to a constant with probability one, as we let the generation level tend to infinity.
This constant $h$ is called the entropy of the limiting measure $\mu$.

2. The Hausdorff and the packing dimension of the random limiting measure $\mu$ are constant and
equal with probability one. The entropy and the dimension satisfy the usual simple relation
dimension = entropy / Lyapunov exponent ([12]). Moreover, the local dimension of $\mu$ equals the Hausdorff
dimension at $\mu$-almost every point.

3. Given the so-called weight function w, which determines the rule of the tree growth, we provide 
an explicit formula for the entropy, and thus for the Hausdorff dimension, in terms of w. 

The key to these results is a Markov process appearing naturally in the construction of a $\mu$-typical
leaf of the tree. After some discussion of the tree structure, the Markov property will be easy to see. 
Some technical difficulties will arise from the non-compactness of the state space. 

Our model is special in the sense that we only allow a finite degree for each vertex, but it is 
general in the sense that after having fixed the maximum number of children K a vertex may have, 
the weight function w, which determines the rule of attachment, can be any positive-valued function 
on $\{0, 1, \dots, K-1\}$.

The paper is structured as follows: The model and the results are presented in Section 2. This
also includes a brief discussion of related models and related results in Section 2.5. Section 3 contains
the main line of the argument, and ends with the proof of the first two results. Section 4 is devoted
to proofs of lemmas which have been used but not proven in Section 3. Finally, Section 5 contains
the proof of the last result.



2 Notation, Definitions and Results 

We consider rooted ordered trees, which are also called family trees or rooted planar trees in the 
literature. 

In order to refer to these trees it is convenient to use genealogical phrasing. The tree is thus 
regarded as the coding of the evolution of a population stemming from one individual (the root of the 
tree), whose "children" form the "first generation" (these are the vertices connected directly to the 
root). In general, the edges of the tree represent parent-child relations, the parent always being the
one closer to the root. The birth order between brothers is also taken into account; this is represented
by the tree being an ordered tree (planar tree).

We only consider the case when every vertex can have at most $K \in \mathbb{N}$ children. We assume $K \ge 2$
to avoid the trivial case when only one child is born per parent. (In that case the tree growth is
linear and the tree has no interesting structure.) We use the index set $I := \{1, 2, \dots, K\}$, and also use
$I^- := \{0, 1, \dots, K-1\}$.

The vertices are labelled by the set

$$\mathcal{N} := \bigcup_{n=0}^{\infty} I^n, \quad \text{where } I^0 := \{\emptyset\},$$

as follows. $\emptyset$ denotes the root of the tree, its first-born child is labelled by 1, the second one by
2, etc., and its last one by $K$; all the vertices in the first generation are thus labelled with the
elements of $I$. Similarly, in general, the children of $x = (i_1, i_2, \dots, i_n)$ are labelled by $(i_1, i_2, \dots, i_n, 1)$,
$(i_1, i_2, \dots, i_n, 2)$, etc. Thus, if a vertex has label $x = (i_1, i_2, \dots, i_n)$, then it is the $i_n$-th child of its
parent, which is the $i_{n-1}$-th child of its own parent, and so on. If $x = (i_1, i_2, \dots, i_n)$ and $y = (j_1, j_2, \dots, j_l)$,
then we will use the shorthand notation $xy$ for the concatenation $(i_1, i_2, \dots, i_n, j_1, j_2, \dots, j_l)$, and with
a slight abuse of notation for $i \in I$, we use $xi$ for $(i_1, i_2, \dots, i_n, i)$.

There is a natural partial ordering $\prec$ on $\mathcal{N}$, namely, $x \prec z$ if $x$ is an ancestor of $z$, that is, if $\exists y \in \mathcal{N}$, $y \ne \emptyset$
such that $z = xy$. We use $x \preceq z$ meaning $x \prec z$ or $x = z$.

We can identify a rooted ordered tree with the set of labels of its vertices, since this set already
identifies the set of edges in the tree. It is clear that a subset $G \subset \mathcal{N}$ may represent a rooted
ordered tree iff $\emptyset \in G$, and for each $(i_1, i_2, \dots, i_n) \in G$ we have $(i_1, i_2, \dots, i_n - 1) \in G$ if $i_n > 1$, and
$(i_1, i_2, \dots, i_{n-1}) \in G$ if $i_n = 1$.

We also think of $\mathcal{N}$ as the complete rooted ordered tree.

$\mathcal{G}$ will denote the set of all finite, rooted ordered trees. The degree of a vertex $x \in G$ will denote the
number of its children in $G$:

$$\deg(x, G) := \max\{i \in I : xi \in G\} \quad (\text{zero if } x1 \notin G).$$

The subtree rooted at a vertex $x \in G$ is

$$G_{\downarrow x} := \{y : xy \in G\},$$

this is just the progeny of x viewed as a rooted ordered tree. 

2.1 The Model 

2.1.1 Continuous-time Model 

Given a function $w : I^- \to \mathbb{R}^+$, referred to as the weight function, our randomly growing tree $T(t)$ is
a continuous-time, time-homogeneous Markov chain on the countable state space $\mathcal{G}$, with initial state
$T(0) = \{\emptyset\}$ and right-continuous trajectories.

The jump rates are the following. Suppose that at some $t \ge 0$ we have $T(t-) = G$; then for
each $x \in G$ which has $\deg(x, G) = j < K$, the process may jump to $G \cup \{xi\}$ with rate $w(\deg(x, G))$,
where $i = j + 1$. This means that each existing vertex $x \in T(t-)$ 'gives birth to a child' with rate
$w(\deg(x, T(t-)))$, independently of the others, and stops reproducing when it reaches $\deg(x, T(t)) = K$.

The Markov chain $T(t)$ is well defined for $t \in [0, \infty)$; it does not blow up in finite time (see the
comment at (3)).

We define the total weight of a tree $G \in \mathcal{G}$ as

$$W(G) := \sum_{x \in G} w(\deg(x, G)).$$

Described in other words, the Markov chain $T(t)$ evolves as follows: assuming $T(t-) = G$, at time
$t$ a new vertex is added to it with total rate $W(G)$, and it is attached with an edge to exactly one
already existing vertex, which is $x \in G$ with probability

$$\frac{w(\deg(x, G))}{\sum_{y \in G} w(\deg(y, G))}.$$

2.1.2 Discrete-time Model 

This continuous-time model naturally contains another, discrete-time model as follows. Define the
stopping times

$$S_n := \inf\{t : |T(t)| = n + 1\},$$

then the Markov chain $T(S_n)$ is a randomly growing tree, where exactly one vertex is born at each
time unit, and every newly born vertex chooses its parent at random, choosing $x$ with probability

$$\frac{w(\deg(x, G))}{\sum_{y \in G} w(\deg(y, G))}$$

if $T(S_{n-1}) = G$.

It was in this framework that Barabási and Albert originally formulated their model [1]. The
relation of the two models is discussed in detail in [21]. As mentioned before, the questions we pose 
can be formulated equivalently in both models, but we will use the continuous-time version in our 
proofs, for reasons of convenience. 
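
The discrete-time rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grow_tree`, the representation of vertices as tuples of child indices, and the dictionary of current degrees are choices made for this sketch, not notation from the paper. Vertices that already have $K$ children are simply excluded from the choice of parent.

```python
import random

def grow_tree(w, K, n_steps, seed=0):
    """Grow a rooted ordered tree for n_steps steps of the discrete-time model.

    w is a sequence of K positive weights: w[j] is the weight of a vertex that
    currently has j children (j = 0, ..., K-1). Vertices are tuples of child
    indices in {1, ..., K}; the empty tuple () is the root.
    Returns a dict mapping each vertex to its current number of children.
    """
    rng = random.Random(seed)
    degree = {(): 0}  # the initial tree consists of the root only
    for _ in range(n_steps):
        # Only vertices with fewer than K children can still reproduce.
        candidates = [x for x, d in degree.items() if d < K]
        weights = [w[degree[x]] for x in candidates]
        # The parent is chosen with probability proportional to its weight.
        parent = rng.choices(candidates, weights=weights, k=1)[0]
        degree[parent] += 1
        child = parent + (degree[parent],)  # next child in birth order
        degree[child] = 0
    return degree
```

For instance, `grow_tree([1.0, 2.0, 0.5], 3, 200)` returns a tree with 201 vertices in which no vertex has more than three children.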



2.2 Some Additional Notation and Known Results 

Let $\tau_x$ be the birth time of vertex $x$,

$$\tau_x := \inf\{t \ge 0 : x \in T(t)\}. \tag{1}$$

Let $\sigma_x$ be the time we have to wait for the appearance of vertex $x$, starting from the moment that its
birth is actually possible (i.e. when no other vertex is obliged to be born before it). Namely, let

(a) $\sigma_\emptyset := 0$,

(b) $\sigma_{y1} := \tau_{y1} - \tau_y$, for any $y \in \mathcal{N}$,

(c) and $\sigma_{yi} := \tau_{yi} - \tau_{y(i-1)}$, for each $y \in \mathcal{N}$ and $i \ge 2$, $i \in I$.

Let the function $g : (0, \infty) \to (0, \infty)$ be defined as

$$g(\lambda) := \mathbf{E} \sum_{i=1}^{K} e^{-\lambda \tau_i} = \sum_{j=1}^{K} \prod_{i=0}^{j-1} \frac{w(i)}{\lambda + w(i)}. \tag{2}$$

The function $g$ plays a central role in the theory of the branching processes related to our model, as
discussed in [21]. However, in the present work we use little of that relation; instead, we list here
the known results that we will use.

1. The equation

$$g(\lambda) = 1$$

has a unique root $\lambda^* > 0$. This $\lambda^*$ is called the Malthusian parameter.

2. This $\lambda^*$ gives the rate of exponential growth of the tree size almost surely. The normalized size
of the tree converges almost surely to a random variable, which we denote by

$$\Theta := \lim_{t \to \infty} e^{-\lambda^* t} |T(t)|.$$

3. $\Theta$ is almost surely positive, and

$$0 < \mathbf{E}\Theta < \infty, \tag{3}$$

which implies (also) that almost surely the process $T(t)$ does not blow up in finite time.



4. Moreover,

$$\mathbf{E}\Theta^2 < \infty. \tag{4}$$

¹ The reason for the notation $g$ is that this function is the Laplace transform of the density of the point process formed
by birth times in the first generation of the tree.



The first statement is obvious in our setting from the definition, since we have assumed $2 \le K < \infty$.
The second and third are shown in [21]. The last statement is also implicit in [21], where the variance
is even calculated. Alternatively, the finiteness of the variance follows from Theorem 6.8.1 in [13],
which states convergence of the normalized size under the condition $\mathbf{E}\big[\big(\sum_{i=1}^{K} e^{-\lambda^* \tau_i}\big)^2\big] < \infty$, which
is again obvious, since $K < \infty$.
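
As a numerical illustration of the first statement, the Malthusian parameter can be approximated by bisection. The sketch below assumes that $g$ has the product form $\sum_{j=1}^{K} \prod_{i=0}^{j-1} w(i)/(\lambda + w(i))$, which is what the Laplace transform of the first-generation birth times gives when the waiting time for the $j$-th child is exponential with rate $w(j-1)$. The function names and the tolerance are hypothetical choices for this sketch.

```python
def g(lam, w):
    """Evaluate g(lam) = sum_{j=1}^{K} prod_{i=0}^{j-1} w[i] / (lam + w[i])."""
    total, prod = 0.0, 1.0
    for wi in w:  # w = [w(0), ..., w(K-1)]
        prod *= wi / (lam + wi)
        total += prod
    return total

def malthusian_parameter(w, tol=1e-12):
    """Solve g(lam) = 1 for lam > 0 by bisection (needs K = len(w) >= 2)."""
    lo, hi = 0.0, 1.0
    while g(hi, w) > 1.0:  # g is decreasing in lam, so enlarge the bracket
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, w) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $K = 2$ and $w \equiv 1$ the equation reduces to $u + u^2 = 1$ with $u = 1/(1 + \lambda)$, so the root is $(\sqrt{5} - 1)/2$; the bisection reproduces this value. The bracket is valid because $g(\lambda) \to K \ge 2$ as $\lambda \to 0$ and $g(\lambda) \to 0$ as $\lambda \to \infty$.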

Remark 2.1. The process $T(t)$ has an alternative construction, which we state here and refer to later.
Define a countably infinite number of independent random variables $\tilde\sigma_x$, indexed with the elements of
$\mathcal{N}$, as follows. Let $\tilde\sigma_\emptyset = 0$, and for $x = i_1 i_2 \dots i_n$, let $\tilde\sigma_x$ be exponentially distributed with parameter
$w(i_n - 1)$. Denoting the parent of $x$ by $p(x)$, we define $\tilde\tau_\emptyset = 0$ and

$$\tilde\tau_x = \tilde\tau_{p(x)} + \tilde\sigma_{p(x)1} + \tilde\sigma_{p(x)2} + \dots + \tilde\sigma_{p(x)i_n}.$$

It is straightforward that with $\tilde T(t) := \{x \in \mathcal{N} : \tilde\tau_x \le t\}$, the process $\tilde T$ has the same distribution as
$T$.
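
The construction in Remark 2.1 lends itself to direct simulation. The following sketch samples the birth times of all vertices in the first few generations; the truncation at a fixed depth, the function names and the dictionary layout are assumptions made for illustration. Because of the truncation, `tree_at` returns only the part of the tree that lies within the sampled generations.

```python
import random

def birth_times(w, K, depth, seed=0):
    """Sample birth times of all vertices up to a given depth.

    Each vertex x = (i_1, ..., i_n) gets an independent exponential waiting
    time with rate w[i_n - 1]; the birth time of the i-th child of p is the
    birth time of p plus the waiting times of the children p1, ..., pi.
    """
    rng = random.Random(seed)
    tau = {(): 0.0}  # the root is born at time 0
    frontier = [()]
    for _ in range(depth):
        next_frontier = []
        for p in frontier:
            t = tau[p]
            for i in range(1, K + 1):
                t += rng.expovariate(w[i - 1])  # waiting time of child pi
                tau[p + (i,)] = t
                next_frontier.append(p + (i,))
        frontier = next_frontier
    return tau

def tree_at(tau, t):
    """Vertices (among the sampled ones) that are born by time t."""
    return {x for x, s in tau.items() if s <= t}
```

With `tau = birth_times([1.0, 2.0], 2, 4)`, the call `tree_at(tau, 1.5)` gives the vertices of the first four generations that exist at time 1.5.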



2.3 Limiting Objects 

Let $T_{\downarrow x}(t) = (T(t))_{\downarrow x}$ denote the subtree of $T(t)$ rooted at $x$, which is the set of descendants of $x$
(including $x$) that are born up to time $t$. (Note that $t$ here is total time, and not the time since the birth
of $x$. In particular, $|T_{\downarrow x}(0)| = 0$ if $x$ is not the root.) For every $x \in \mathcal{N}$, we introduce the variables
$\Theta_x$, corresponding to the growth of the subtree under $x$, analogously to $\Theta$,

$$\Theta_x := \lim_{t \to \infty} e^{-\lambda^*(t - \tau_x)} |T_{\downarrow x}(t)|.$$

The letter $\Theta$ refers to the variable corresponding to the root. Clearly, for every $x \in \mathcal{N}$, the random
variables $\Theta_x$ are identically distributed. The basic relation between the different $\Theta_x$ variables in the
tree is that for any $x \in \mathcal{N}$,

$$\Theta_x = \sum_{i=1}^{K} e^{-\lambda^*(\tau_{xi} - \tau_x)} \Theta_{xi}, \tag{5}$$

which is straightforward from $|T_{\downarrow x}(t)| = 1 + \sum_{i=1}^{K} |T_{\downarrow xi}(t)|$.

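Now let us ask the following question. Fix a vertex $x \in \mathcal{N}$, and at time $t$, draw a vertex $\zeta_t$ uniformly
at random from $T(t)$. What is the probability that $\zeta_t$ is a descendant of $x$, so $x \preceq \zeta_t$? As shown in (6)
below, this probability tends to an almost sure limit $\Delta_x$ as $t \to \infty$, which can be expressed using the
$\tau$ and $\Theta$ random variables. Indeed, by the definitions of $\Theta_x$ and $\Theta$,

$$\Delta_x := \lim_{t \to \infty} \frac{|T_{\downarrow x}(t)|}{|T(t)|} = \frac{e^{-\lambda^* \tau_x}\, \Theta_x}{\Theta}. \tag{6}$$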

We can now, for any $n \in \mathbb{N}$, define a random measure $\mu_n$ on the finite set $\{x : |x| = n\}$ (on the $n$-th
generation of the tree), by

$$\mu_n(\{x\}) := \Delta_x.$$

This is a probability measure almost surely, which follows from the facts $\Delta_\emptyset = 1$ and $\Delta_y = \sum_{i=1}^{K} \Delta_{yi}$.
Let $H_n$ denote the entropy of $\mu_n$, that is

$$H_n = -\sum_{|x| = n} \Delta_x \log \Delta_x.$$
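
To make the entropy concrete, here is a small sketch that computes $-\sum p \log p$ for a weight vector, together with the finite-time proxy $|T_{\downarrow x}(t)| / |T(t)|$ for the weights of generation $n$. The helper names and the tuple encoding of vertices are assumptions of this sketch. Note that on a finite tree the proxy weights of generation $n$ sum to less than one, because vertices in earlier generations are not counted; only the limiting weights form a probability measure.

```python
import math

def entropy(weights):
    """Entropy -sum p log p (natural logarithm) of a vector of weights.

    Terms with p = 0 contribute 0, by the usual convention.
    """
    return -sum(p * math.log(p) for p in weights if p > 0.0)

def subtree_fractions(tree, n):
    """Fraction of the vertices of `tree` lying at or below each generation-n vertex.

    `tree` is a collection of vertex labels (tuples of child indices).
    """
    counts = {}
    for x in tree:
        if len(x) >= n:
            counts[x[:n]] = counts.get(x[:n], 0) + 1
    return {x: c / len(tree) for x, c in counts.items()}
```

For the uniform distribution on four points, `entropy([0.25] * 4)` equals $\log 4$, the largest value possible on a set of that size.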
2.3.1 A Measure as the Limiting Object for the Tree

Let $\partial\mathcal{N}$ denote the set of leaves of the complete tree: $\partial\mathcal{N} = \{1, 2, \dots, K\}^{\infty}$. The concatenation
$xy$ makes sense for $x \in \mathcal{N}$ and $y \in \partial\mathcal{N}$, and then $xy \in \partial\mathcal{N}$. Also, for $x \in \mathcal{N}$ and $z \in \partial\mathcal{N}$, we
write $x \prec z$ if $\exists y \in \partial\mathcal{N}$ such that $z = xy$. For $x \in \mathcal{N}$ we denote the set of leaves under $x$ by
$\partial\mathcal{N}(x) = \{z \in \partial\mathcal{N} : x \prec z\}$.

Let $\partial\mathcal{N}$ be equipped with the usual metric

$$d(x, y) := \lambda^{\min\{n \,:\, x_n \ne y_n\}} \quad \text{for } x \ne y, \tag{7}$$

where $x_n$ denotes the $n$-th letter of $x$ and $0 < \lambda < 1$ is an arbitrary constant. This constant is often chosen to be $1/e$, which makes certain
formulae appear simpler. Yet we will not fix the value, so that our formulae express the dependence
of the studied quantities on this arbitrary choice.

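With the help of the random limiting measures $\mu_n$, we define $\mu$ on the cylinder sets $\partial\mathcal{N}(x)$ of $\partial\mathcal{N}$
by

$$\mu(\partial\mathcal{N}(x)) := \mu_n(\{x\}) = \Delta_x, \quad \text{if } |x| = n,$$

and then we extend $\mu$ from $\{\partial\mathcal{N}(x) : x \in \mathcal{N}\}$ to the sigma-algebra generated (on $\partial\mathcal{N}$). Our results
concern the properties of this extended random measure $\mu$.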

Remark 2.2. Now we can tell why we use the continuous and not the discrete-time model in our
work. The limiting relative weights $\Delta_x$ defined in (6) also make sense and are interesting in the
discrete-time setting, just like the measure $\mu$ and the entropy $H_n$. Our results are formulated in terms
of these quantities. However, the limiting "absolute" weights $\Theta_x$, which will play a central role in the
proofs, don't make sense in the discrete-time setting.

2.3.2 Dimensions of Measures: Definitions 

For the reader's convenience, let us review the definitions of local dimension, Hausdorff dimension
and packing dimension of measures. The lower and upper local dimensions of $\mu$ at $x$ are defined in
[8] (2.15) and (2.16) as

$$\underline{\dim}_{\mathrm{loc}}\, \mu(x) = \liminf_{r \to 0} \frac{\log \mu(B(x, r))}{\log r}, \tag{8}$$

$$\overline{\dim}_{\mathrm{loc}}\, \mu(x) = \limsup_{r \to 0} \frac{\log \mu(B(x, r))}{\log r}, \tag{9}$$

where $B(x, r)$ is the ball of radius $r$ centred at $x$. If the lower and upper local dimensions coincide
at some $x$, their common value is called the local dimension at $x$. The Hausdorff and packing dimensions of $\mu$ are
defined in [9] (10.8) and (10.9) as

$$\dim_H \mu = \sup\{s : \underline{\dim}_{\mathrm{loc}}\, \mu(x) \ge s \text{ for } \mu\text{-almost all } x\}, \tag{10}$$
$$\dim_P \mu = \sup\{s : \overline{\dim}_{\mathrm{loc}}\, \mu(x) \ge s \text{ for } \mu\text{-almost all } x\}. \tag{11}$$

The names of these dimensions come from the fact ([9] (10.10) and (10.11)) that

$$\dim_H \mu = \inf\{\dim_H E : E \text{ is a Borel set with } \mu(E) > 0\},$$
$$\dim_P \mu = \inf\{\dim_P E : E \text{ is a Borel set with } \mu(E) > 0\}.$$

We are ready to state our results. 

2.4 Results 

Theorem 2.3. The limiting entropy

$$h := \lim_{n \to \infty} \frac{H_n}{n}$$

exists and is constant with probability one.
Theorem 2.4. The Hausdorff dimension $\dim_H \mu$ and the packing dimension $\dim_P \mu$ of the measure
$\mu$ are constant and equal with probability one, and $h$ and the dimensions satisfy the relation

$$\dim_H \mu = \dim_P \mu = \frac{h}{-\log \lambda}, \tag{12}$$

where $\lambda$ is from (7). Moreover, the local dimension of $\mu$ equals $\dim_H \mu = \dim_P \mu$ at $\mu$-almost every
point.

Theorem 2.5. Furthermore, an explicit formula for h is given: 
This can be computed given the weight function w. 



2.5 Some Related Models and Results 

In the last decades there has been much progress in describing the asymptotic structure of randomly 
evolving trees, especially tree growth processes based on fragmentation processes. These processes 
are closely related to our model, see Remark 3.6. Limiting objects called "random real trees" and
"continuum random trees" were introduced, to which the evolving trees converge, after an appropriate
rescaling of the distances on the tree. Much of the structure of these limiting objects is understood,
see e.g. [10], [?], [11].

Our concept of the limiting measure $\mu$ is different from these. It is a measure on the set of
leaves of the infinite complete tree (with each vertex having exactly $K$ children), which is a metric
space, but the metric structure is trivial: it is not a result of any spatial scaling, and it carries no
information about the tree growth process. On the other hand, the weights given by $\mu$ are a result of an
appropriate rescaling of the tree size, where size means cardinality. In short, we are really interested in 
the asymptotic weight distribution, and not the asymptotic metric structure. This asymptotic weight 
distribution is also studied in the Physics literature, see e.g. [2], where a quantity analogous to the 
local dimension is calculated for a continuous time fragmentation process. 

Population growth models, studied extensively in the theory of branching processes (see e.g. [13]),
are also intimately related to our model, as discussed in detail in [21]. Scientists have discussed the
Hausdorff dimension of the set of individuals that are actually (sooner or later) born. However, in
our model this is uninteresting, because - almost surely - every vertex is eventually born. Indeed, it 
is not the set, but the measure which captures the long-term structure of the tree well, and of which 
the dimension is interesting. 

Similarly, in the limiting continuous trees obtained in [?], [11] by a spatial rescaling of the
evolving tree, the metric structure is of main interest, and the Hausdorff dimension and Hausdorff
measure of sets are the natural questions to ask ([?], [7]), unlike in our setting.

The continuous time version of our tree growth process can also be translated into a branching 
random walk, with time turning into displacement. Then the asymptotic growth can be described 
analogously, see the Biggins theorem in [?] or [16]. However, with that point of view, the natural
questions about the limiting structure are quite different. 



3 Main Line of the Proof 
3.1 Idea of the Proof 

The random limiting measure $\mu$ depends on the random growth of the tree. The idea of the proof is
the following: we define a random leaf in the limiting tree according to the measure $\mu$. The way the
random leaf is defined is based on a step-by-step construction of the subsequent generations of the
limiting tree, together with a step-by-step construction of a path from the root to the random leaf.
This is done in such a way that a Markov process appears naturally along this path, and the local
dimension of the measure $\mu$ at this random point can be computed as an ergodic average. It follows
that this average is constant with probability one, unconditionally. Thus, although the measure
depends on the random tree growth, this ergodic average is constant, and it is the local dimension
of the measure at all the $\mu$-typical leaves of the limiting tree. This implies that this constant is the
Hausdorff (and also the packing) dimension of $\mu$ with probability one. Some technical difficulty comes
from the fact that the state space of the key Markov process is continuous and non-compact, so to
apply ergodic theorems, one has to work for the existence of the invariant measure (while uniqueness
is easy).



3.2 Markov Structure of the Tree 

The content of this short section is mainly repetition of material from [20]. These concepts and
statements allow for a good understanding of the tree structure, on which our main construction
(in Section 3.3) relies. Lemma 3.2 will also be used formally in Section 3.3 to get an easy proof of
the fact that our step-by-step construction of the limiting tree is equivalent to the original model
(Proposition 3.7).



Definition 3.1. We say that a system of random variables $(Y_x)_{x \in \mathcal{N}}$ constitutes a tree-indexed
Markov field if for any $x \in \mathcal{N}$, the collection of variables $(Y_y : x \prec y)$ and the collection
$(Y_z : x \not\preceq z)$ are conditionally independent, given $Y_x$.

We state the following: 

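Lemma 3.2. For each $x \in \mathcal{N}$ let $V_x$ denote the vector $V_x := (\sigma_x, \Theta_x)$. Then the collections of
variables $A_x := (V_y : x \prec y)$ and $B_x := (V_z : x \not\preceq z;\ \sigma_x)$ are conditionally independent, given $\Theta_x$.

Proof. Recall Remark 2.1, the alternative construction of $T(t)$. From that, it is straightforward that
the collection $A_x$ is in fact constructed by the set of independent variables $\tilde A_x := (\sigma_y : x \prec y)$.
Similarly, recall (5), and decompose $\Theta_{p(x)}$, where $p(x)$ is the parent of vertex $x$,

$$\Theta_{p(x)} = \sum_{i=1}^{K} e^{-\lambda^*(\tau_{p(x)i} - \tau_{p(x)})}\, \Theta_{p(x)i} = \sum_{i=1}^{K} e^{-\lambda^*(\sigma_{p(x)1} + \dots + \sigma_{p(x)i})}\, \Theta_{p(x)i}.$$

This means that if we take the set of variables $\tilde B_x := (\sigma_y : x \not\prec y)$, then $B_x$ is constructed by $\tilde B_x \cup \{\Theta_x\}$.
Given $\Theta_x$, the two collections $\tilde A_x \cup \{\Theta_x\}$ and $\tilde B_x \cup \{\Theta_x\}$ are conditionally independent; this way
the same is true for $A_x$ and $B_x$, so the statement of the lemma follows.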

□ 

Corollary 3.3. The variables $(\Theta_x)_{x \in \mathcal{N}}$ constitute a tree-indexed Markov field.

Proof. Direct consequence of Lemma 3.2, since $V_x = (\sigma_x, \Theta_x)$. □

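Definition 3.4. We introduce the variables $R_x$, indexed by $\mathcal{N}$. For the root we leave $R_\emptyset$ undefined.
For any other vertex $y'$ which has a parent $y$, so for any $y' = yi$ with $i \in I$, let

$$R_{y'} := \lim_{t \to \infty} \frac{|T_{\downarrow y'}(t)|}{|T_{\downarrow y}(t)|} = \frac{e^{-\lambda^*(\tau_{y'} - \tau_y)}\, \Theta_{y'}}{\Theta_y} = \frac{\Delta_{y'}}{\Delta_y}.$$

Notice that for $x = (i_1 i_2 \dots i_n)$, $\Delta_x$ is a telescopic product,

$$\Delta_x = \Delta_{i_1} \frac{\Delta_{i_1 i_2}}{\Delta_{i_1}} \frac{\Delta_{i_1 i_2 i_3}}{\Delta_{i_1 i_2}} \cdots \frac{\Delta_{i_1 \dots i_n}}{\Delta_{i_1 \dots i_{n-1}}} = R_{i_1} R_{i_1 i_2} R_{i_1 i_2 i_3} \cdots R_{i_1 \dots i_n}.$$

Equivalently, for $|x| = n$,

$$\log \Delta_x = \sum_{l=1}^{n} \log R_{x|_l}, \tag{13}$$

where $x|_l$ denotes the first $l$ letters of the string $x$ (which denotes the ancestor of $x$ on the $l$-th level
of the tree).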
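
The telescoping identity and its logarithmic form (13) can be checked on a toy example. The weights below are hypothetical numbers chosen only so that $\Delta_\emptyset = 1$ and $\Delta_y = \Delta_{y1} + \Delta_{y2}$ hold on a binary tree; the function name is likewise an assumption of this sketch.

```python
import math

def ratios_along_path(delta, x):
    """Return [R_{x|1}, ..., R_{x|n}], where R_{yi} = delta[yi] / delta[y]."""
    return [delta[x[:l]] / delta[x[:l - 1]] for l in range(1, len(x) + 1)]

# Hypothetical weights on the first two generations of a binary tree (K = 2);
# they satisfy delta[()] = 1 and delta[y] = delta[y1] + delta[y2].
delta = {
    (): 1.0,
    (1,): 0.6, (2,): 0.4,
    (1, 1): 0.45, (1, 2): 0.15, (2, 1): 0.1, (2, 2): 0.3,
}

x = (1, 2)
R = ratios_along_path(delta, x)          # ratios along the path to x
log_sum = sum(math.log(r) for r in R)    # right-hand side of (13)
```

Here the product of the entries of `R` recovers `delta[x]`, and `log_sum` agrees with `math.log(delta[x])` up to rounding.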
3.3 Construction of the Random Leaf

We will now give a different construction of the tree from the ones seen before. Namely, we construct
the system of $V_x = (\sigma_x, \Theta_x)$ variables starting from the root, and going step-by-step, from generation
to generation. Together with these, we compute the $R_x$ and $\Delta_x$ variables, and use them to construct
a random path $\{y_n\}$ from the root to the edge of the infinite tree. The $y_n$ will be chosen from the
children of $y_{n-1}$ in a "size-biased" way. We will use this path in the proofs of our results. For the sake
of simple notation, we suppose for a moment that the maximum number of children of any vertex is
two, that is, $K = 2$. It is straightforward to construct the corresponding generations and the random
path for any $K < \infty$. For the rest of this section we treat the distribution of $\Theta$ as known.

Recall that $\sigma_1$, $\sigma_2$, $\Theta_1$ and $\Theta_2$ are independent. Keeping that in mind, using

$$\Theta = e^{-\lambda^* \sigma_1}\left(\Theta_1 + e^{-\lambda^* \sigma_2}\, \Theta_2\right), \tag{14}$$

we will consider the conditional joint distribution of $(\sigma_2, \Theta_1, \Theta_2)$, given $\Theta$. (Of course, $\sigma_1$ is,
conditionally, a deterministic function of these, but we will not use the value.) Now we can construct the
generations, together with the random path in the following steps.

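1. Pick $\Theta_\emptyset$ at random, according to its distribution, and fix $\sigma_\emptyset = 0$. Also, fix $y_0 = \emptyset$.

2. First generation

(a) Pick $(\sigma_2, \Theta_1, \Theta_2)$ according to their conditional distribution, given $\Theta_\emptyset$.

(b) Define $\Delta_1 = R_1 = \frac{e^{-\lambda^* \sigma_1} \Theta_1}{\Theta_\emptyset}$ (which is equal to $\frac{\Theta_1}{\Theta_1 + e^{-\lambda^* \sigma_2} \Theta_2}$ and happens not to depend on
$\sigma_1$). Also define $\Delta_2 = R_2 = \frac{e^{-\lambda^*(\sigma_1 + \sigma_2)} \Theta_2}{\Theta_\emptyset} = \frac{e^{-\lambda^* \sigma_2} \Theta_2}{\Theta_1 + e^{-\lambda^* \sigma_2} \Theta_2}$.

(c) Choose $y_1$ according to the conditional probabilities $\mathbf{P}(y_1 = 1 \mid \Theta, \sigma_2, \Theta_1, \Theta_2) = R_1$ and
$\mathbf{P}(y_1 = 2 \mid \Theta, \sigma_2, \Theta_1, \Theta_2) = R_2$.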

3. Second generation

(a) Repeat the steps seen before for the progeny of vertex 1, to get $(\sigma_{12}, \Theta_{11}, \Theta_{12})$ and also $R_{11}$
and $R_{12}$. This is done only using the information carried by $\Theta_1$, conditionally independently
of $(\Theta, \Theta_2)$. This conditional independence is the consequence of Corollary 3.3. Since we
already know $R_1$, we can now compute the values $\Delta_{11} = R_1 R_{11}$ and $\Delta_{12} = R_1 R_{12}$.

(b) Independently of the previous steps, use $\Theta_2$ to get $(\sigma_{22}, \Theta_{21}, \Theta_{22})$, $R_{21}$ and $R_{22}$. We then
also have $\Delta_{21}$ and $\Delta_{22}$.

(c) Choose $y_2$ from the children of $y_1$, according to the conditional distribution given by the
$R_x$ variables in the second generation. Namely, if $y_1 = 1$,

$$\mathbf{P}(y_2 = 11 \mid y_1 = 1, \sigma_{12}, \Theta_{11}, \Theta_{12}) = R_{11},$$
$$\mathbf{P}(y_2 = 12 \mid y_1 = 1, \sigma_{12}, \Theta_{11}, \Theta_{12}) = R_{12},$$

and if $y_1 = 2$,

$$\mathbf{P}(y_2 = 21 \mid y_1 = 2, \sigma_{22}, \Theta_{21}, \Theta_{22}) = R_{21},$$
$$\mathbf{P}(y_2 = 22 \mid y_1 = 2, \sigma_{22}, \Theta_{21}, \Theta_{22}) = R_{22},$$

conditionally independently of the entire past of the construction.

4. n-th generation

(a) Having constructed all the $\Theta_x$ with $|x| = n - 1$, split these all in the way above, conditionally
independently of each other (and the entire past of the construction), to get the $R_z$ and $\Delta_z$
variables in the $n$-th generation. In particular,

$$R_{xi} = \frac{e^{-\lambda^*(\sigma_{x1} + \dots + \sigma_{xi})}\, \Theta_{xi}}{\Theta_x}.$$

(b) According to the value of $y_{n-1}$, choose $y_n$ from its children, according to the corresponding
$R_z$ distribution (conditionally independently of the entire past).
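
The size-biased choice of the path can be illustrated on its own, taking the weights $\Delta_x$ as given. The sketch below does not sample the conditional laws of $(\sigma, \Theta)$ used in the steps above; it only shows the path selection, with child $xi$ chosen with probability $R_{xi} = \Delta_{xi} / \Delta_x$. The weights and the function name are hypothetical.

```python
import random

def sample_path(delta, K, n, rng):
    """Sample y_n by n size-biased steps from the root.

    At each step the child xi of the current vertex x is chosen with
    probability R_{xi} = delta[xi] / delta[x].
    """
    y = ()
    for _ in range(n):
        children = [y + (i,) for i in range(1, K + 1)]
        probs = [delta[c] / delta[y] for c in children]
        y = rng.choices(children, weights=probs, k=1)[0]
    return y

# Hypothetical weights on two generations of a binary tree (K = 2).
delta = {
    (): 1.0,
    (1,): 0.6, (2,): 0.4,
    (1, 1): 0.45, (1, 2): 0.15, (2, 1): 0.1, (2, 2): 0.3,
}
```

Repeated calls such as `sample_path(delta, 2, 2, random.Random(0))` visit the vertex `(1, 1)` with empirical frequency close to `delta[(1, 1)]`, in line with the telescoping product of the ratios.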
Remark 3.5. As mentioned before, our model is intimately related to a branching process, as
discussed in [21]. In branching processes, the idea of size biasing is not at all new; its importance is
emphasized e.g. in [?].

Remark 3.6. This step-by-step construction of the tree is similar to the fragmentation processes
discussed e.g. in [?]. There the usage of "randomly tagged branches" based on size-biased choices
is a standard technique, see Section 1.2.3 there. Note however, that our step-by-step construction is
not a fragmentation process in the classical sense. In particular, the sequence of measures $\mu_n$ is not
Markov: the process also "remembers" the values $\Theta_x$, which influence how the weight $\mu_n(\{x\})$ at $x$ is
further "fragmented".

Proposition 3.7. With $V_x = (\sigma_x, \Theta_x)$ as before, the distribution of $\{V_x\}_{x \in \mathcal{N}}$ in the above construction
is identical to the distribution in the randomly growing tree model.

Proof. The statement we are proving is about the joint distribution of countably infinitely many
(real-valued) random variables, so this joint distribution can be viewed as a measure on $\mathbb{R}^{\mathcal{N}}$,² with
the $\sigma$-algebra of measurable sets being the $\sigma$-algebra generated by cylinder sets, defined in terms
of finitely many of the $\sigma_x$ and $\Theta_x$. So to prove that the two measures on $\mathbb{R}^{\mathcal{N}}$, given by the two
constructions, coincide, it is enough to see that they coincide on such cylinder sets.

In terms of joint distributions: it is enough to see that the distributions of $\{V_x\}_{x \in \mathcal{N}}$ coming
from the two constructions have identical finite-dimensional marginals. In particular, it is enough to
show that for every $n$, the distribution of $\{V_x\}_{x \in \mathcal{N}, |x| \le n}$ in the above construction is identical to the
distribution in the randomly growing tree model.

This is easy to see by induction: 

• For $n = 0$ we have chosen the law of $\Theta_\emptyset$ properly by construction, also $\sigma_\emptyset = 0$ as it should be.

• For $n = 1$, the $\{V_x\}_{x \in I}$ are constructed to have the right conditional joint distribution,
given $\Theta_\emptyset$, so the $n = 0$ statement implies the $n = 1$ statement. In particular, the $\Theta_x$ for $|x| = 1$
are distributed as they should be.

• For $n \ge 2$, the same argument (the construction) gives inductively that the joint distribution of
the $\{V_x\}_{x \in W}$ is what it should be, for any family $W$ of $x$-es which consists of a vertex and its
children. However, the construction also ensures the conditional independence of $\{V_y\}_{x \prec y}$ and
$\{V_z\}_{x \not\prec z}$ given $\Theta_x$, as in Lemma 3.2. This, together with the joint distributions of the $\{V_x\}_{x \in W}$
(with $W$ as above) already characterizes the joint distribution of $\{V_x\}_{x \in \mathcal{N}, |x| \le n}$.

□ 

From now on, we will use the alternative construction of the tree in our discussion, so
Proposition 3.7 is used all the time in the proof, but this will not be formally mentioned.

Definition 3.8. Denote by $\mathcal{T}$ the $\sigma$-algebra generated by $\{\sigma_x \mid x \in \mathcal{N}\}$, which contains the full tree
evolution.

Note that for any $x \in \mathcal{N}$, $\Theta_x$ is measurable with respect to $\mathcal{T}$, so $\mathcal{T}$ is also the $\sigma$-algebra generated
by $\{\sigma_x, \Theta_x \mid x \in \mathcal{N}\}$, namely all the data about the tree (but not about the random leaf) during
the parallel construction of the tree and the random leaf just presented.

The usefulness of the random leaf we constructed is shown by the following: 

Lemma 3.9. Conditioned on $\mathcal{T}$, the conditional distribution of the leaf $y_\infty := \lim_{n \to \infty} y_n$ is exactly the measure
$\mu$. Similarly, the conditional distribution of $y_n$ is exactly $\mu_n$.

Proof. The second statement can be seen by induction: $\mu_0$ obviously gives weight 1 to the single
point $\emptyset = y_0$. Later, by construction of $y_{n+1}$, for any $x \in \mathcal{N}$ with $|x| = n$ and any $i \in I$ we have
$\mathbf{P}(y_{n+1} = xi \mid y_n = x, \mathcal{T}) = R_{xi}$, so if we assume inductively that $\mathbf{P}(y_n = x \mid \mathcal{T}) = \mu_n(\{x\}) = \Delta_x$, then
$\mathbf{P}(y_{n+1} = xi \mid \mathcal{T}) = \Delta_x R_{xi} = \Delta_{xi} = \mu_{n+1}(\{xi\})$ for any $|xi| = n + 1$, so $y_{n+1}$ is indeed distributed
according to $\mu_{n+1}$.

The first statement is an immediate consequence of the second, since for any cylinder set $\partial\mathcal{N}(x)$,
if $|x| = n$, we have $\mathbf{P}(y_\infty \in \partial\mathcal{N}(x) \mid \mathcal{T}) = \mathbf{P}(y_n = x \mid \mathcal{T}) = \mu_n(\{x\}) = \mu(\partial\mathcal{N}(x))$. □

² We could write $([0, \infty) \times [0, \infty))^{\mathcal{N}}$, but a measure on this can be viewed as a special case of a measure on $\mathbb{R}^{\mathcal{N}}$.
Corollary 3.10. Conditioned on the tree, the conditional expectation of $-\log \Delta_{y_n}$ is exactly $H_n$.

Proof. Indeed, by the above lemma,

$$\mathbf{E}(-\log \Delta_{y_n} \mid \mathcal{T}) = -\sum_{|x| = n} \mathbf{P}(y_n = x \mid \mathcal{T}) \log \Delta_x = -\sum_{|x| = n} \mu_n(\{x\}) \log \Delta_x = -\sum_{|x| = n} \Delta_x \log \Delta_x = H_n.$$

□ 

3.4 Markov Processes Along the Random Path 

The key to the proof is the following easy observation: 

Proposition 3.11. The stochastic process $X_n = \Theta_{y_n}$ ($n = 0, 1, 2, \dots$) is a homogeneous Markov 
process. By "homogeneous" we mean that the transition kernel does not depend on $n$. 

Proof. This is clear from the construction in Section 3.3. Indeed, when constructing $\Theta_{y_{n+1}}$, only the 
value of $\Theta_{y_n}$ is used, and the construction is the same on every level. □ 

The reason to construct in Section 3.3 the entire tree of pairs $(\Theta_x, \Delta_x)$ step by step - and not just 
the random path $\{y_n\}$ on an already existing tree - was exactly to make the Markov property of $\Theta_{y_n}$ 
obvious. A direct proof without the step-by-step construction would also not be hard, but in our 
view the underlying phenomena are more transparent this way. 

Based on this proposition and equation (13), the proof of our main results will be a reference to 
an appropriate ergodic theorem. However, there are two issues to deal with first. First, the state 
space of our Markov processes is continuous and even non-compact, so the existence and uniqueness of the 
invariant measure need to be discussed. This is done in the next proposition. Second, the quantity 
$-\log R_{y_n}$, of which we want to calculate the ergodic average, is not an observable on the state space 
of $X_n$, so this state space needs to be extended. This obvious extension will be done in Corollary 3.16. 

Before starting the main arguments, let us formulate, as a lemma, an easy observation about the 
distribution of $\Theta$. We will use this in the arguments both for the uniqueness and the existence of 
the invariant measure of $X_n$. From now on, we will use the notation $\mathbb{R}^+$ for the set of positive real 
numbers: 

$$ \mathbb{R}^+ = (0, \infty). $$

It is important that $0$ is not included, e.g. when we speak of functions being continuous or nonzero 
on $\mathbb{R}^+$. 

Lemma 3.12. $\Theta$ is absolutely continuous w.r.t. Lebesgue measure on $\mathbb{R}^+$, with a density function $\pi$ 
which is continuous and strictly positive on $\mathbb{R}^+$. 

Proof. Start from the decomposition (5). It shows that $\Theta$ is of the form $\Theta = e^{-\lambda^* \sigma_1} \tilde\Theta$ where $\sigma_1$ is 
independent of $\tilde\Theta$, which immediately implies that $\Theta$ must be equivalent to Lebesgue measure on the 
interval from zero to its maximal value. On the other hand, $\Theta \ge e^{-\lambda^* \sigma_1} \Theta_1 + e^{-\lambda^* (\sigma_1 + \sigma_2)} \Theta_2$ implies 
that $\Theta$ is not bounded, since $\Theta_1$ and $\Theta_2$ are independent and distributed as $\Theta$, and their prefactors 
can be arbitrarily close to 1. The same decomposition, applied once again, also implies that the 
density $\pi$ is even a continuous function (more precisely, can be chosen to be continuous), since $\Theta$ 
being absolutely continuous w.r.t. Lebesgue measure implies that so is $\tilde\Theta$ (since $K < \infty$), the density 
of which is once again smoothed by $\Theta = e^{-\lambda^* \sigma_1} \tilde\Theta$. □ 

For the discussion of the invariant measures, let $P$ denote the transition kernel of $X_n$ - that is, 
$P(t)$ is the conditional distribution of $X_{n+1}$ under the condition $X_n = t$ (for every $t \in \mathbb{R}^+$). We also 
use it as the operator acting on measures by $\eta P = \int_{\mathbb{R}^+} P(t) \, d\eta(t)$. 

Proposition 3.13. The transition kernel $P$ of the Markov process $X_n = \Theta_{y_n}$ has exactly one invariant 
measure. 

Proof. Recall that the decomposition (5) is the key relation between the $\Theta_x$-es of the different 
generations, on which the construction of $X_n$ - and thus every property of the transition kernel - is 
based. 

The key observation is that $P(t)$ is equivalent to Lebesgue measure (on $\mathbb{R}^+$, of course) for every 
$t \in \mathbb{R}^+$. This (and more) is explicitly stated and proven in Lemma 4.5. However, since we feel that 
this statement is really intuitive, let us give a rough reasoning here as well. 

First, Lemma 3.12 implies that the distribution of $\Theta$ is equivalent to Lebesgue measure on $\mathbb{R}^+$. 
Recall now the construction in Section 3.3, the essence of which is that $P(t)$ is the conditional 
distribution of $\Theta'$ under the condition $\Theta = t$, where $\Theta'$ is a random choice from the set $\{\Theta_1, \dots, \Theta_K\}$. Look 
again at the relation between $\Theta$ and $\{\Theta_1, \dots, \Theta_K\}$, which is the decomposition (5), or the simplified 
form for $K = 2$, which is (14). It shows that given any value of $t$, the condition $\Theta = t$ doesn't rule out 
any of the possible values of a $\Theta_i$ with $1 \le i \le K$. Also, the conditioning on $\Theta = t$ doesn't spoil the 
absolute continuity of $\Theta_i$, and the method of randomly choosing $\Theta'$ from $\{\Theta_1, \dots, \Theta_K\}$ also preserves 
absolute continuity. With this, the key observation is shown. Again, see Lemma 4.5 for a detailed 
proof. 

This observation about $P(t)$ implies that for any measure $\eta$ on $\mathbb{R}^+$, the first iterate $\eta P$ is already 
equivalent to Lebesgue measure. This in turn implies that any invariant measure $\eta = \eta P$ is equivalent 
to Lebesgue measure, so any two invariant measures are equivalent. 

Suppose now, for contradiction, that there exist two different invariant probability measures. Then two 
different extremal invariant probability measures also have to exist. But two different extremal 
invariant probability measures must be mutually singular, which contradicts the previous argument. Thus 
there is at most one invariant probability measure. 

The existence follows from Lemma 3.15 and Lemma 3.14. Indeed, the limiting measure $\nu$ of 
Lemma 3.14 has to be invariant by Lemma 3.15: the distribution of $X_{n+1}$ is the distribution of $X_n$ 
acted on by $P$, so letting $n \to \infty$ on both sides gives $\nu = \nu P$. □ 

Lemma 3.14. The sequence of random variables $X_n = \Theta_{y_n}$ is weakly convergent to some measure $\nu$ 
on $\mathbb{R}^+$. 

To keep our arguments easy to follow, we delay the proof to Section 4.2. 

Lemma 3.15. $P$ is continuous with respect to weak convergence of measures. 

The proof is delayed to Section 4.3. 

Corollary 3.16. The stochastic process $Y_n = (\Theta_{y_n}, R_{y_n})$ ($n = 1, 2, \dots$) is a homogeneous Markov 
process, for which the transition kernel has exactly one invariant measure. 

Proof. Notice that during the construction of the tree in Section 3.3, $R_{y_n}$ is constructed by using only 
the value of $\Theta_{y_{n-1}}$ (not even $R_{y_{n-1}}$), in a time-homogeneous way. Thus $Y_n$ is really homogeneous 
Markov. Let $\hat P$ denote the transition kernel. From the construction, $\hat\eta \hat P$ depends only on the first 
marginal of $\hat\eta$, and on this marginal it acts exactly like $P$. So for any measure $\hat\eta$ with first marginal $\nu$, 
$\hat\nu := \hat\eta \hat P$ is invariant by the invariance of $\nu$ under $P$. The uniqueness is obvious from the uniqueness 
of $\nu$. □ 

Now we are ready to apply an ergodic theorem on the sequence $-\log R_{y_n}$ to get the central 
technical result, from which our first two theorems easily follow. 

Corollary 3.17. The limit $h := -\lim_{n \to \infty} \frac{1}{n} \log \Delta_{y_n}$ exists and is constant with probability one. 

Proof. $-\log R_{y_n}$ is an observable on the state space of $Y_n$, and $h$ is exactly the ergodic average of 
this observable by (13). So it is guaranteed to be constant by the unique existence of the invariant 
measure and Theorem 1.1 in Chapter X of [6]. We give the details of the (standard) argument now. 

Theorem 1.1 in Chapter X of [6] states that "If $\{x_n, n \ge 0\}$ is a stationary Markov process, and 
if $z$ is an invariant random variable, then $z$ is measurable on the sample space of $x_0$". To formally 
apply this theorem to our process, we first need to construct a stationary version of $Y_n$. Namely, 
let $\tilde Y_n$ be the Markov process with generator $\hat P$ started from $\tilde Y_0$ which is distributed according to 
the unique invariant measure $\hat\nu$. For this process, the ergodic average of an observable, being an 
invariant random variable (see [6], Chapter X for the definition), is by the above theorem measurable 
on the state space - that is, constant with probability one, conditioned on the initial value (more 
precisely, for $\hat\nu$-a.e. initial value). But in our case, this constant is indeed independent of the initial 
value - actually, it is constant for every initial value, since $\hat P$ brings any measure (e.g. a point measure 
concentrated on any point) into a measure equivalent with $\hat\nu$. Now notice that the property that 
the ergodic average is the same constant with probability one, independently of the initial state, is a 
property of the transition kernel $\hat P$ only (and not of $\tilde Y_n$ as a stochastic process), so it also holds for 
the process $Y_n$. □ 

Remember that $\frac{1}{n} H_n$ is a conditional expectation of $-\frac{1}{n} \log \Delta_{y_n}$ by Corollary 3.10. So since we 
have just shown the almost sure convergence of $-\frac{1}{n} \log \Delta_{y_n}$, the almost sure convergence of $\frac{1}{n} H_n$ 
follows, if we have e.g. dominated convergence. This will be guaranteed by the following lemma. 

Lemma 3.18. Let $\tilde\mu$ be any Borel probability measure on $\partial\mathcal{N}$, with $K < \infty$. Using the notation in 
Section 2.3.1, for every $x \in \partial\mathcal{N}$ let 

$$ f_n(x) := -\frac{1}{n} \log \tilde\mu(\partial\mathcal{N}(x|_n)). $$

Then $f := \sup_n f_n$ is integrable with respect to the measure $\tilde\mu$. 

The proof is delayed to Section 4.1. Now we are ready to prove the main results of the paper. 

Proof of Theorem 2.3. For every $x \in \partial\mathcal{N}$ let $f_n(x) = -\frac{1}{n} \log \mu_n(\{x|_n\}) = -\frac{1}{n} \log \mu(\partial\mathcal{N}(x|_n))$. By 
Lemma 3.9, Corollary 3.17 states exactly that for almost every realization of the tree, $f_n(x)$ converges 
$\mu$-almost surely to $h$. 

Now divide the statement of Corollary 3.10 by $n$ to get 

$$ \frac{1}{n} H_n = E\left( -\frac{1}{n} \log \Delta_{y_n} \,\Big|\, \mathcal{T} \right) = \int_{\{x \in \mathcal{N} : |x| = n\}} -\frac{1}{n} \log(\mu_n(\{x\})) \, d\mu_n(x) = \int_{\partial\mathcal{N}} f_n(x) \, d\mu(x). $$

We can now apply the dominated convergence theorem to finish the proof, since we can use the 
supremum as an integrable dominating function, see Lemma 3.18. □ 

Proof of Theorem 2.4. We first show the second statement of the theorem by showing that the local 
dimension of $\mu$ at the leaf $\lim_n y_n$ is exactly $\frac{h}{-\log \Lambda}$, where $h$ is from Corollary 3.17. Let $B(x, r)$ denote 
the $r$-neighbourhood of the point $x \in \partial\mathcal{N}$ w.r.t. the metric (7). For $r = \Lambda^n$, this neighbourhood is 
formed exactly by the descendants of $x|_n$, so $B(x, \Lambda^n) = \partial\mathcal{N}(x|_n)$. The $\mu$-measure of this set is 

$$ \mu(B(x, \Lambda^n)) = \mu(\partial\mathcal{N}(x|_n)) = \mu_n(\{x|_n\}) = \Delta_{x|_n}, $$

while the logarithm of the diameter of this set is $n \log \Lambda$. Thus the local dimension of $\mu$ at the leaf $x$ 
is 

$$ \dim_{loc} \mu(x) = \lim_{n \to \infty} \frac{\log \Delta_{x|_n}}{n \log \Lambda} = \lim_{n \to \infty} \frac{-\frac{1}{n} \log \Delta_{x|_n}}{-\log \Lambda} $$

(if this limit exists), by the definition in (8) and Q. 

Applying that to $x = \lim_n y_n$, Lemma 3.9 and Corollary 3.17 say that this limit indeed exists and 
is equal to $\frac{h}{-\log \Lambda}$ for $\mu$-almost every $x$, which is what we wanted to show. 

The first statement of the theorem is now an immediate consequence of the definitions of the 
Hausdorff and packing dimension of a measure in (10) and (11). 

□ 
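
As an informal numerical illustration of the ergodic-average argument (not part of the original proof), the following Python sketch replaces the random tree by a toy case in which every vertex splits its mass among its children in the same fixed proportions. The names `ratios`, `base` and `local_dimension_estimate` are hypothetical; `base` plays the role of the constant of the metric (7), and the random path picks each child with probability equal to its share of the mass, as in Lemma 3.9.

```python
import math
import random


def local_dimension_estimate(ratios, base, depth, rng):
    """Return log(cylinder mass) / (depth * log(base)) along one random path.

    At every level the path moves to child i with probability ratios[i],
    and the mass of the current cylinder is multiplied by ratios[i].
    """
    log_mass = 0.0
    for _ in range(depth):
        r = rng.choices(ratios, weights=ratios, k=1)[0]
        log_mass += math.log(r)
    return log_mass / (depth * math.log(base))


ratios = [0.3, 0.7]     # toy, non-random splitting proportions (assumption)
base = 0.5              # toy metric constant in (0, 1) (assumption)
rng = random.Random(0)  # fixed seed for reproducibility

estimate = local_dimension_estimate(ratios, base, 20000, rng)

# In this toy case the entropy rate is the Shannon entropy of `ratios`,
# and the predicted local dimension is entropy_rate / (-log(base)).
entropy_rate = -sum(r * math.log(r) for r in ratios)
predicted = entropy_rate / (-math.log(base))
```

In this simplified setting the ratios along the path are independent, so the convergence of `estimate` to `predicted` is just the law of large numbers; in the paper the same role is played by the ergodic theorem for the Markov process $Y_n$.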

4 Proofs of Auxiliary Lemmas 

4.1 The Lemma for Dominated Convergence of the Entropies 

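In this section we prove Lemma 3.18. 

Proof of Lemma 3.18. For arbitrary $M < \infty$, let us define the set 

$$ F_M^{(n)} := \{x : f_n(x) > M\} = \left\{x : -\frac{1}{n} \log \tilde\mu(\partial\mathcal{N}(x|_n)) > M\right\} = \{x : \tilde\mu(\partial\mathcal{N}(x|_n)) < e^{-nM}\}. $$

Since $f_n$ takes constant values on the cylinder sets of level $n$, of which there are at most $K^n$, we have 

$$ \tilde\mu(F_M^{(n)}) \le K^n e^{-nM} = (K e^{-M})^n. \tag{15} $$

Now we define 

$$ F_M := \{x : f(x) > M\} = \bigcup_n \{x : f_n(x) > M\} = \bigcup_n F_M^{(n)}. $$

By (15), for $M > \log(2K)$, 

$$ \tilde\mu(F_M) \le \sum_{n=1}^{\infty} (K e^{-M})^n = \frac{K e^{-M}}{1 - K e^{-M}} \le 2K e^{-M}. $$

Thus, since $f \ge 0$, 

$$ \int_{\partial\mathcal{N}} f(x) \, d\tilde\mu(x) \le \sum_{M=1}^{\infty} M \tilde\mu(\{x : M - 1 < f(x) \le M\}) < \infty, $$

where the sum is finite because $\{x : M - 1 < f(x) \le M\} \subset F_{M-1}$, so the $M$-th term is at most $2K e \cdot M e^{-M}$ 
for every $M > 1 + \log(2K)$. 

□ 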



4.2 Limiting Distribution of $\Theta_{y_n}$ Along the Random Path 

In this section we prove Lemma 3.14. We begin with three lemmas of elementary probability whose 
statements do not rely on the setting of the paper. 

The first one is a trivial generalization of the ordinary weak law of large numbers. We could call 
it "Weak law of large numbers with arbitrary weights". For this purpose, we will consider a sequence 
of probability vectors $\{p^n\}_{n=1}^{\infty}$, where each $p^n$ is a probability vector $p^n = (p^n_1, p^n_2, \dots, p^n_{N_n})$. 
We plan to calculate weighted averages of independent random variables with weight vectors $p^n$. We 
expect such an average to be close to the expectation, if every term has a sufficiently small weight. 
So we will say that the sequence $\{p^n\}_{n=1}^{\infty}$ is proper if 

$$ \lim_{n \to \infty} \max\{p^n_j : 1 \le j \le N_n\} = 0. $$


Lemma 4.1. Let $\nu_0$ be a probability distribution on $\mathbb{R}$ with finite expectation $m$. Let $\{p^n\}_{n=1}^{\infty}$ be a 
proper sequence of weight vectors, and let $\nu_n$ be the distribution of 

$$ \sum_{j=1}^{N_n} p^n_j Z_j, $$

where $Z_1, Z_2, \dots, Z_{N_n}$ are independent random variables with distribution $\nu_0$. Then 

$$ \nu_n \Rightarrow \delta_m. $$

Note that this is the usual weak law if $p^n_j = \frac{1}{n}$ ($j = 1, \dots, n$). 

Proof. The proof is trivial following the standard proof of the weak law with characteristic functions. 

□ 
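
The statement of Lemma 4.1 can be checked informally by simulation. The Python sketch below is only an illustration under stated assumptions: the distribution $\nu_0$ is taken to be exponential with mean $m = 1$, and the weights are chosen proportional to the index $j$, so that they are non-uniform but their maximum, $2/(n+1)$, is small. The helper name `weighted_average` is hypothetical.

```python
import random


def weighted_average(weights, sampler, rng):
    """Return sum_j weights[j] * Z_j for fresh independent draws Z_j."""
    return sum(p * sampler(rng) for p in weights)


rng = random.Random(0)  # fixed seed for reproducibility
n = 20000

# Non-uniform probability vector: p_j proportional to j, so max_j p_j = 2 / (n + 1).
raw = [j + 1 for j in range(n)]
total = sum(raw)
weights = [r / total for r in raw]

# nu_0 = Exp(1), whose expectation is m = 1 (an assumption of this sketch).
avg = weighted_average(weights, lambda r: r.expovariate(1.0), rng)
```

With these choices the variance of `avg` equals $\sum_j (p^n_j)^2$, which is of order $1/n$, so a single realization is already close to $m$.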

Now we turn to a lemma which could be called "size-biased sampling with arbitrary extra weights". 
For this purpose, let $p = (p_1, p_2, \dots, p_N)$ be a probability vector, and let $Z_1, Z_2, \dots, Z_N$ be random 
variables on $\mathbb{R}^+$ (meaning $P(Z_j > 0) = 1$). We will say that the random variable $V$ is the size-biased 
random choice from $Z_1, Z_2, \dots, Z_N$ with extra weights $p_1, p_2, \dots, p_N$, if it is constructed the following 
way: 

1. Generate a realization of $(Z_1, Z_2, \dots, Z_N)$, and call it $(z_1, z_2, \dots, z_N)$. 

2. Having that, choose a random integer $J$ from the index set $\{1, 2, \dots, N\}$ with the weight 

$$ \frac{p_j z_j}{\sum_{i=1}^{N} p_i z_i} $$

given to each $j$. 

3. Set $V = z_J$. 

Note that this is the usual size-biased random choice if all the $p_j$ are equal. Our lemma states that 
this size-biased random choice with extra weights behaves just like the ordinary one, provided that 
every weight is small. 
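
The three-step construction above can be sketched in Python as follows. This is a minimal illustration, not code from the paper; the function name `size_biased_choice` and its argument names are hypothetical.

```python
import random


def size_biased_choice(z, p, rng):
    """Size-biased random choice from the realization z with extra weights p.

    Index j is selected with probability p[j] * z[j] / sum_i p[i] * z[i],
    and the selected value z[J] is returned.
    """
    masses = [pj * zj for pj, zj in zip(p, z)]
    return rng.choices(z, weights=masses, k=1)[0]


# Usage: step 1 generates the realization, steps 2 and 3 are done by the function.
rng = random.Random(7)
p = [0.2, 0.3, 0.5]                          # extra weights (a probability vector)
z = [rng.expovariate(1.0) for _ in p]        # realization of positive random variables
v = size_biased_choice(z, p, rng)
```

`random.choices` normalizes the given weights itself, so the denominator $\sum_i p_i z_i$ does not have to be computed explicitly.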

To state the lemma, let $\nu_0$ be a probability distribution on $\mathbb{R}^+$ with finite expectation $m$. We will 
say that the measure $\nu$ is the size-biased version of $\nu_0$, if it is absolutely continuous with respect to 
$\nu_0$, and the density is $\rho(t) = \frac{1}{m} t$. In other words, $\nu(A) = \frac{1}{m} \int_A t \, d\nu_0(t)$. 

Lemma 4.2. Let $\nu_0$ be a probability distribution on $\mathbb{R}^+$ with finite expectation $m$. Let $\{p^n\}_{n=1}^{\infty}$ be 
a proper sequence of weight vectors, and (for each $n$) let $Z^n_1, Z^n_2, \dots, Z^n_{N_n}$ be independent random 
variables with distribution $\nu_0$. Let $V_n$ be the size-biased random choice from $Z^n_1, Z^n_2, \dots, Z^n_{N_n}$ with extra weights 
$p^n_1, p^n_2, \dots, p^n_{N_n}$. Let $\nu$ be the size-biased version of $\nu_0$. Then 

$$ V_n \Rightarrow \nu. $$

Proof. Let $F$ denote the cumulative distribution function of $\nu$, that is, $F(t) = \nu([0, t])$. Let $F_n$ denote 
the cumulative distribution function of $V_n$. For some fixed $t$, we write it in the form 

$$ F_n(t) = E\left( P\left(V_n \le t \mid \{Z^n_j\}_{j=1}^{N_n}\right) \right). \tag{16} $$

The conditional probability inside is just the weight of $j$-s with $Z^n_j \le t$, so 

$$ P\left(V_n \le t \mid \{Z^n_j\}_{j=1}^{N_n}\right) = \frac{\sum_{j=1}^{N_n} p^n_j Z^n_j 1(Z^n_j \le t)}{\sum_{j=1}^{N_n} p^n_j Z^n_j}. $$

According to Lemma 4.1, the denominator converges weakly (and thus, in probability) to $E(Z^n_1) = 
m > 0$ as $n \to \infty$. Similarly, the numerator converges in probability to 

$$ E(Z^n_1 1(Z^n_1 \le t)) = \int_{\mathbb{R}^+} s 1(s \le t) \, d\nu_0(s) = m \nu([0, t]). $$

This implies that the quotient converges in probability to $\nu([0, t]) = F(t)$. Since this quotient is a conditional 
probability, it is obviously bounded by 1, so (16) implies that $F_n(t) \to F(t)$. □ 
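
A quick Monte Carlo check of this limit can be sketched in Python. The concrete choices below are assumptions made only for illustration: $\nu_0$ is the exponential distribution with mean 1, whose size-biased version has density $t e^{-t}$ and distribution function $F(t) = 1 - (1 + t) e^{-t}$, and the extra weights are a fixed random probability vector with all entries of order $1/N$. The function names are hypothetical.

```python
import math
import random


def size_biased_choice(z, p, rng):
    """Return z[J], where J = j has probability p[j] * z[j] / sum_i p[i] * z[i]."""
    masses = [pj * zj for pj, zj in zip(p, z)]
    return rng.choices(z, weights=masses, k=1)[0]


def empirical_cdf_at(t, n_vars, repetitions, rng):
    """Estimate P(V <= t) for the size-biased choice V from n_vars Exp(1) variables."""
    raw = [rng.random() + 0.5 for _ in range(n_vars)]
    total = sum(raw)
    p = [r / total for r in raw]  # fixed extra weights, each of order 1 / n_vars
    hits = 0
    for _ in range(repetitions):
        z = [rng.expovariate(1.0) for _ in range(n_vars)]
        if size_biased_choice(z, p, rng) <= t:
            hits += 1
    return hits / repetitions


rng = random.Random(42)  # fixed seed for reproducibility
t = 2.0
estimate = empirical_cdf_at(t, n_vars=400, repetitions=3000, rng=rng)
limit = 1.0 - (1.0 + t) * math.exp(-t)  # F(t) for the size-biased Exp(1) law
```

The estimate carries both Monte Carlo noise and a small finite-$N$ bias, so it should only be compared with `limit` up to a tolerance of a few percent.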

The following lemma is just a re-statement of the previous one. This is the form that we will use. 

Lemma 4.3. Let $\nu_0$ be a probability distribution on $\mathbb{R}^+$ with finite expectation, and let $\nu$ be its 
size-biased version. Let $\phi$ be a bounded continuous function on $\mathbb{R}^+$. Then for every $\varepsilon > 0$ there exists a 
$\delta > 0$ such that for any probability vector $(p_1, p_2, \dots, p_N)$ which satisfies that 

$$ \max\{p_j : 1 \le j \le N\} < \delta, $$

if $Z_1, Z_2, \dots, Z_N$ are independent with distribution $\nu_0$, then the size-biased random choice (called $V$) 
from $Z_1, Z_2, \dots, Z_N$ with extra weights $p_1, p_2, \dots, p_N$ satisfies 

$$ \left| E(\phi(V)) - \int \phi(t) \, d\nu(t) \right| < \varepsilon. $$

Before proving Lemma 3.14, we need one more tiny statement about the structure of the growing 
tree. 

Lemma 4.4. For any vertex $x \in \mathcal{N}$, let 

$$ T_x = e^{-\lambda^* \tau_x}, \tag{17} $$

and for every $x$ with $|x| = n$ let 

$$ p_x := \frac{T_x}{\sum_{|y|=n} T_y}. $$

Then the sequence $p^{n,\max} := \max\{p_x : |x| = n\}$ converges to zero in probability. 

Proof. We prove the stronger statement that $p^{n,\max}$ converges to zero with probability one. We use 
the form 

$$ p^{n,\max} = \frac{\max\{T_x : |x| = n\}}{\sum_{|y|=n} T_y}. \tag{18} $$

We show that the numerator converges to zero with probability one, while the denominator 
converges to a positive limit with probability one. 

1. If the numerator does not converge to zero, then there is some $\varepsilon > 0$ and there are infinitely 
many vertices $x \in \mathcal{N}$ with $T_x > \varepsilon$. Then, for all these $x$ we have $\tau_x < \tau^* := -\frac{\log \varepsilon}{\lambda^*}$, so infinitely 
many vertices are born within the finite time $\tau^*$. This is known to have probability zero - see 
comment at (3). 

2. Iterating the decomposition of $\Theta$, we get 

$$ \Theta = \sum_{|x|=n} T_x \Theta_x. \tag{19} $$

Let $\Sigma_n$ denote the $\sigma$-algebra generated by $\{\sigma_x : x \in \mathcal{N}, |x| \le n\}$ - that is, the complete history 
of the tree growth up to the $n$-th level. Similarly, let $\Sigma$ denote the $\sigma$-algebra generated by 
$\{\sigma_x : x \in \mathcal{N}\}$. Clearly $\Sigma_n \subset \Sigma_{n+1}$, $\Sigma$ is generated by $\bigcup_n \Sigma_n$, and $\Theta$ is $\Sigma$-measurable. So Lévy's 
'upward' theorem ensures that $E(\Theta \mid \Sigma_n) \to \Theta$ with probability one. However, if $|x| = n$, then 
$\Theta_x$ is independent of $\Sigma_n$, while $T_x$ is $\Sigma_n$-measurable, so (19) implies that 

$$ E(\Theta \mid \Sigma_n) = \sum_{|x|=n} T_x E\Theta_x = E\Theta \sum_{|x|=n} T_x, $$

so with probability one the denominator of (18) converges to $\frac{\Theta}{E\Theta} > 0$. 

□ 

Now we can complete the goal of this subsection: 

Proof of Lemma 3.14. Actually we give the limit explicitly. Let $\nu$ be the measure on $\mathbb{R}^+$ with density 
function $c x \pi(x)$, where $\pi(x)$ is the density of $\Theta$, and $c = \frac{1}{E\Theta}$ is a normalizing constant. We will show 
that 

$$ X_n \Rightarrow \nu. \tag{20} $$

Let us look directly at $X_n = \Theta_{y_n}$ for some fixed $n$. This can also be constructed in the following way: 

1. Generate the birth times $\tau_x$ for all vertices $x$ with $|x| = n$ (that is, on the $n$-th level of the tree). 
This defines the values $T_x = e^{-\lambda^* \tau_x}$, $|x| = n$. For better transparency, let us normalize these 
values to get a probability distribution on the $n$-th level of the tree: $p_x := \frac{T_x}{\sum_{|z|=n} T_z}$ (for $|x| = n$). 

2. Also generate the random variables $\Theta_x$ for $|x| = n$, which are independent of the $p_x$. 

3. Now $y_n$ is chosen from the points $|x| = n$ according to the distribution $\mu_n$, so the weight given 
to some $x$ is 

$$ \frac{\Delta_x}{\sum_{|z|=n} \Delta_z} = \frac{T_x \Theta_x}{\sum_{|z|=n} T_z \Theta_z} = \frac{p_x \Theta_x}{\sum_{|z|=n} p_z \Theta_z}. $$

So, having the values $p_x$ fixed, the value $X_n = \Theta_{y_n}$ is the result of a size-biased sampling from the 
independent random variables $\Theta_x$, $|x| = n$, with additional weights $p_x$ - just like in the context of 
Lemma 4.2 and Lemma 4.3. 

Now we can prove (20). Let $\phi$ be a fixed bounded continuous function on $\mathbb{R}^+$, let $M_\phi$ be an upper 
bound of $|\phi|$, and let $m_\phi = \int \phi \, d\nu$ (which satisfies $|m_\phi| \le M_\phi$). Let $\varepsilon > 0$ be arbitrary. 

Choose $\delta > 0$ according to Lemma 4.3, so that if all the $p_x$ on some level $|x| = n$ are smaller than $\delta$, 
then 

$$ |E(\phi(X_n) \mid \{p_x\}) - m_\phi| < \varepsilon. $$

Lemma 4.4 implies that there exists an $n_0$ such that for all $n > n_0$, 

$$ P(\max\{p_x : |x| = n\} \ge \delta) < \frac{\varepsilon}{2 M_\phi}. $$

Let $\Omega_{n,\delta}$ denote the event that $\max\{p_x : |x| = n\} < \delta$. For $n > n_0$ we get 

$$ |E(\phi(X_n)) - m_\phi| \le \int_{\Omega} |E(\phi(X_n) - m_\phi \mid \{p_x\})| \, dP = \int_{\Omega \setminus \Omega_{n,\delta}} |E(\phi(X_n) - m_\phi \mid \{p_x\})| \, dP + \int_{\Omega_{n,\delta}} |E(\phi(X_n) - m_\phi \mid \{p_x\})| \, dP $$
$$ \le 2 M_\phi P(\Omega \setminus \Omega_{n,\delta}) + \int_{\Omega_{n,\delta}} \varepsilon \, dP \le \varepsilon + \varepsilon = 2\varepsilon. $$

□ 



4.3 Weak Continuity of the Transition Kernel 

This section is devoted to the proof of Lemma 3.15. 

Proof of Lemma 3.15. We first show in Lemma 4.5 that the transition kernel $P$ can be written as 
$(\eta P)(B) = \int_{\mathbb{R}^+} \int_B k(t, s) \, ds \, d\eta(t)$ where the kernel function $k(t, s)$ is continuous in the first variable 
(actually it is continuous in both variables). Lemma 4.6 - which is a pure probability statement - 
says that such a kernel is continuous with respect to weak convergence of measures. □ 

In the lemma, we show a little more than what is needed for the above proof. In particular, we 
also show that the kernel function $k(t, s)$ is nowhere zero on $\mathbb{R}^+ \times \mathbb{R}^+$, because this is used in the 
proof of Proposition 3.13. 

Lemma 4.5. The transition kernel $P$ can be written as $(\eta P)(B) = \int_{\mathbb{R}^+} \int_B k(t, s) \, ds \, d\eta(t)$ where the 
kernel function $k(t, s)$ is continuous in both variables (in its domain $(t, s) \in \mathbb{R}^+ \times \mathbb{R}^+$), and strictly 
positive. 

Proof. For the time of the proof, let $\Theta$ and $\Theta'$ denote two consecutive values of the process, say 
$\Theta := X_n = \Theta_{y_n}$, $\Theta' = X_{n+1} = \Theta_{y_{n+1}}$. So the kernel function $k(t, s)$ is just the conditional density of 
$\Theta'$ (as a function of $s$), under the condition $\Theta = t$. So 

$$ k(t, s) = \frac{p(t, s)}{\pi(t)}, $$

where $p(t, s)$ is the joint density of the pair $(\Theta, \Theta')$, and $\pi(t)$ is its first marginal - that is, the density 
of $\Theta$. 

We know from Lemma 3.12 that $\Theta$ is indeed absolutely continuous w.r.t. Lebesgue measure, and 
the density $\pi$ is continuous and nonzero on $\mathbb{R}^+$. Knowing this, we now show that $p(t, s)$ is also 
continuous in both variables and nonzero on $\mathbb{R}^+ \times \mathbb{R}^+$, which completes the proof. 

We restrict to the case $K = 2$. The case of a general $K < \infty$ causes no additional difficulty 
other than messy notation. Following the construction of the tree in Section 3.3, we start with 
$\sigma_1, \sigma_2, \Theta_1, \Theta_2$ independent, with $\sigma_i$ being exponentially distributed with parameter $w(i - 1)/\lambda^*$ and 
$\Theta_i$ being distributed as $\Theta$ ($i = 1, 2$). We introduce the temporary notation $S_i = e^{-\sigma_i}$ and denote 
its density by $g_i$. Explicit calculation gives that 

$$ g_i(u) = \frac{w(i-1)}{\lambda^*} u^{\frac{w(i-1)}{\lambda^*} - 1} 1_{(0,1)}(u), \tag{21} $$

of which we will only use that $u g_i(u)$ is bounded. 
Denote the joint density of $(S_1, S_2, \Theta_1, \Theta_2)$ by 

$$ f(u_1, u_2, t_1, t_2) = g_1(u_1) g_2(u_2) \pi(t_1) \pi(t_2). $$

We define 

$$ \Theta = S_1 \Theta_1 + S_1 S_2 \Theta_2 = S_1 (\Theta_1 + S_2 \Theta_2). $$

To get the appropriate joint distributions, in the random vector $(S_1, S_2, \Theta_1, \Theta_2)$ we replace $S_1$ by 
$\Theta$, so let us denote the joint density of $(\Theta, S_2, \Theta_1, \Theta_2)$ by $\tilde f$. The density transformation formula gives 

$$ \tilde f(t, u_2, t_1, t_2) = \frac{1}{t_1 + u_2 t_2} f\left(\frac{t}{t_1 + u_2 t_2}, u_2, t_1, t_2\right) = \frac{1}{t} \, \frac{t}{t_1 + u_2 t_2} \, g_1\left(\frac{t}{t_1 + u_2 t_2}\right) g_2(u_2) \pi(t_1) \pi(t_2). $$

According to the construction, $\Theta'$ is chosen to be either $\Theta_1$ or $\Theta_2$, with conditional probabilities (given 
$(S_2, \Theta_1, \Theta_2)$ and conditionally independently of $\Theta$) 

$$ P(\Theta' = \Theta_1 \mid S_2, \Theta_1, \Theta_2) = \frac{\Theta_1}{\Theta_1 + S_2 \Theta_2}, \qquad P(\Theta' = \Theta_2 \mid S_2, \Theta_1, \Theta_2) = \frac{S_2 \Theta_2}{\Theta_1 + S_2 \Theta_2}. $$

So the joint density of $(\Theta, \Theta')$ is 

$$ p(t, s) = \iint_{\mathbb{R}^2} \frac{s}{s + u_2 t_2} \tilde f(t, u_2, s, t_2) \, dt_2 \, du_2 + \iint_{\mathbb{R}^2} \frac{u_2 s}{t_1 + u_2 s} \tilde f(t, u_2, t_1, s) \, dt_1 \, du_2 $$
$$ =: \iint_{\mathbb{R}^2} f_1(t, s, u_2, t_2) \, dt_2 \, du_2 + \iint_{\mathbb{R}^2} f_2(t, s, u_2, t_1) \, dt_1 \, du_2. \tag{22} $$


All there is left is to show that both integrals on the right hand side are continuous and nonzero for 
$(t, s) \in \mathbb{R}^+ \times \mathbb{R}^+$. Now the integrands $f_1$ and $f_2$ are not exactly continuous, but they are continuous 
on their supports.³ On the other hand, for every $(t, s) \in \mathbb{R}^+ \times \mathbb{R}^+$ the support of each integrand is 
a nice set (described in the footnote) with a boundary of Lebesgue measure zero. That is, for every 
$(t_0, s_0) \in \mathbb{R}^+ \times \mathbb{R}^+$, 

$$ f_1(t, s, u_2, t_2) \to f_1(t_0, s_0, u_2, t_2) \quad \text{as } (t, s) \to (t_0, s_0), \text{ for Lebesgue-a.e. } (u_2, t_2) \in \mathbb{R}^2. $$

To get the desired continuity of the first integral by the Lebesgue dominated convergence theorem, 
we only need to find an integrable (in $(u_2, t_2)$) uniform (in $(t, s)$ near $(t_0, s_0)$) upper bound for 

$$ f_1(t, s, u_2, t_2) = \frac{s}{s + u_2 t_2} \, \frac{1}{t} \, \frac{t}{s + u_2 t_2} \, g_1\left(\frac{t}{s + u_2 t_2}\right) g_2(u_2) \pi(s) \pi(t_2). $$

The first factor is at most 1, and the product $\frac{t}{s + u_2 t_2} g_1\left(\frac{t}{s + u_2 t_2}\right)$ is bounded because $u g_1(u)$ is bounded 
due to (21). So we have 

$$ f_1(t, s, u_2, t_2) \le C \frac{1}{t} \pi(s) g_2(u_2) \pi(t_2) \le C \left(\frac{1}{t_0} + 1\right) (\pi(s_0) + 1) g_2(u_2) \pi(t_2) $$

if $(t, s)$ is close enough to $(t_0, s_0)$, since $\frac{1}{t} \pi(s)$ is continuous in $(t_0, s_0)$. This upper bound is clearly 
integrable in $(u_2, t_2)$, so the dominated convergence theorem ensures that the integral is also 
continuous. 

³The supports of the two integrands are actually not the same. Both of them are characterized by the system of inequalities 
$\{0 < t_1, t_2;\ 0 < u_2 < 1;\ 0 < \frac{t}{t_1 + u_2 t_2} < 1\}$, but with the choice $s = t_1$ or $s = t_2$, respectively. 

The second integral in (22) can be shown to be continuous in exactly the same way. Thus the 
continuity of $k(t, s)$ is proven. 

To get that $p(t, s)$ (and thus $k(t, s)$) is strictly positive on $\mathbb{R}^+ \times \mathbb{R}^+$, we only need to note that 
the support of the integrand is nonempty for every $(t, s) \in \mathbb{R}^+ \times \mathbb{R}^+$ in both integrals on the right 
hand side of (22). This comes again from (14), which shows that any pair of positive values is possible 
for $(\Theta, \Theta_1)$ (in case of the first integrand) or for $(\Theta, \Theta_2)$ (in case of the second integrand). (See the 
footnote 3 for explicit formulae.) The integrands are of course also non-negative, so both integrals 
are positive. □ 

Lemma 4.6. Let $k : \mathbb{R}^+ \times \mathbb{R}^+ \to [0, \infty)$ be a function continuous in the first variable, such that for 
every $t \in \mathbb{R}^+$ the function $k(t, \cdot)$ is a probability density on $\mathbb{R}^+$ - that is, $\int_{\mathbb{R}^+} k(t, s) \, ds = 1$. Let the 
operator $P$ be defined on Borel probability measures of $\mathbb{R}^+$ by 

$$ (\eta P)(B) := \int_{\mathbb{R}^+} \int_B k(t, s) \, ds \, d\eta(t) $$

for every Borel probability measure $\eta$ on $\mathbb{R}^+$ and every Borel set $B \subset \mathbb{R}^+$. Then $P$ is continuous with 
respect to weak convergence of measures. 

This lemma is an easy consequence of the following: 

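Lemma 4.7. Let $k : \mathbb{R}^+ \times \mathbb{R}^+ \to [0, \infty)$ be a function as in Lemma 4.6, and for every $t \in \mathbb{R}^+$ let $\kappa_t$ 
denote the measure on $\mathbb{R}^+$ with density $k(t, \cdot)$. Then if $t_n$ is a sequence in $\mathbb{R}^+$ converging to $t$, then 
$\kappa_{t_n}$ converges to $\kappa_t$ weakly. 

Proof. By assumption, $\{k(t_n, \cdot)\}_{n=1}^{\infty}$ is a sequence of density functions converging pointwise to the 
density function $k(t, \cdot)$. This implies weak convergence of the corresponding measures through the 
Fatou lemma: for any Borel set $B \subset \mathbb{R}^+$, with complement $B^c = \mathbb{R}^+ \setminus B$, 

$$ \liminf_{n \to \infty} \kappa_{t_n}(B) = \liminf_{n \to \infty} \int_B k(t_n, s) \, ds \ge \int_B \liminf_{n \to \infty} k(t_n, s) \, ds = \int_B k(t, s) \, ds = \kappa_t(B), $$

similarly 

$$ \liminf_{n \to \infty} \kappa_{t_n}(B^c) \ge \kappa_t(B^c), $$

which implies 

$$ \limsup_{n \to \infty} \kappa_{t_n}(B) = 1 - \liminf_{n \to \infty} \kappa_{t_n}(B^c) \le 1 - \kappa_t(B^c) = \kappa_t(B). $$

These together give 

$$ \kappa_{t_n}(B) \to \kappa_t(B). $$

□ 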

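Proof of Lemma 4.6. Let $\phi : \mathbb{R}^+ \to \mathbb{R}$ be bounded and continuous and let $\eta_n$ be a sequence of measures 
on $\mathbb{R}^+$ converging weakly to $\eta$. By the definition of $P$, 

$$ \int_{\mathbb{R}^+} \phi \, d(\eta_n P) = \int_{\mathbb{R}^+ \times \mathbb{R}^+} k(t, s) \phi(s) \, d(\eta_n(t) \times \mathrm{Leb}(s)) = \int_{\mathbb{R}^+} \left( \int_{\mathbb{R}^+} k(t, s) \phi(s) \, ds \right) d\eta_n(t). $$

The function 

$$ \Phi(t) := \int_{\mathbb{R}^+} k(t, s) \phi(s) \, ds $$

is obviously bounded, and also continuous: this is exactly the statement of Lemma 4.7. But then the 
weak convergence of $\eta_n$ to $\eta$ means exactly that 

$$ \int_{\mathbb{R}^+} \Phi(t) \, d\eta_n(t) \to \int_{\mathbb{R}^+} \Phi(t) \, d\eta(t), $$

so we have 

$$ \int_{\mathbb{R}^+} \phi \, d(\eta_n P) \to \int_{\mathbb{R}^+} \Phi(t) \, d\eta(t) = \int_{\mathbb{R}^+} \phi \, d(\eta P) $$
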
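for every bounded continuous $\phi$, which is exactly what we want to prove. □ 

5 Computation of the Entropy 

Proof of Theorem 2.5. We know that $\frac{1}{n} H_n = -\frac{1}{n} \sum_{|x|=n} \Delta_x \log \Delta_x$ converges almost surely to some 
constant $h$, and this constant is equal to the limit of the expected values. For this section we use the 
shorthand notation already introduced in (17), 

$$ T_x = e^{-\lambda^* \tau_x}. \tag{23} $$

To compute $h$, first observe that 

$$ E\left( \sum_{|x|=n} \Delta_x \Theta \log(\Delta_x \Theta) \right) = E\left( \Theta \sum_{|x|=n} \Delta_x \log \Delta_x \right) + E\left( (\Theta \log \Theta) \sum_{|x|=n} \Delta_x \right) = E\left( \Theta \sum_{|x|=n} \Delta_x \log \Delta_x \right) + E(\Theta \log \Theta), $$

where we have used that $\sum_{|x|=n} \Delta_x = 1$ by definition. 

Next we observe that on the other hand, the same expression can be written as 

$$ E\left( \sum_{|x|=n} \Delta_x \Theta \log(\Delta_x \Theta) \right) = E\left( \sum_{|x|=n} T_x \Theta_x \log(T_x \Theta_x) \right) = E\left( \sum_{|x|=n} \Theta_x T_x \log T_x \right) + E\left( \sum_{|x|=n} T_x \Theta_x \log \Theta_x \right) $$
$$ = \sum_{|x|=n} (E\Theta_x) E(T_x \log T_x) + \sum_{|x|=n} E(T_x) E(\Theta_x \log \Theta_x) = (E\Theta) E\left( \sum_{|x|=n} T_x \log T_x \right) + E(\Theta \log \Theta) E\left( \sum_{|x|=n} T_x \right), $$

where we have used that for any $x \in \mathcal{N}$, $\Theta_x$ and $T_x$ are independent. Recall that $E\left( \sum_{|x|=n} T_x \right) = 1$. 
Since @ implies that $E(\Theta \log \Theta) < \infty$, comparing the two formulae gives the conclusion 

$$ E\left( \Theta \sum_{|x|=n} \Delta_x \log \Delta_x \right) = (E\Theta) E\left( \sum_{|x|=n} T_x \log T_x \right). \tag{24} $$

We compute the right-hand side by induction, 

$$ A_n := E\left( \sum_{|x|=n} T_x \log T_x \right) = E\left( \sum_{|y|=n-1} \sum_{i=1}^{K} T_{yi} \log T_{yi} \right) $$
$$ = E\left( \sum_{i=1}^{K} e^{-\lambda^* (\tau_{yi} - \tau_y)} \right) E\left( \sum_{|y|=n-1} T_y \log T_y \right) + E\left( \sum_{|y|=n-1} T_y \right) E\left( \sum_{i=1}^{K} e^{-\lambda^* (\tau_{yi} - \tau_y)} \log e^{-\lambda^* (\tau_{yi} - \tau_y)} \right) $$
$$ = A_{n-1} + E\left( \sum_{i=1}^{K} T_i \log T_i \right), $$

so 

$$ A_n = n E\left( \sum_{i=1}^{K} T_i \log T_i \right). $$

Now write this back to (24) to get 

$$ E\left( \Theta \frac{1}{n} H_n \right) = (E\Theta) E\left( \sum_{i=1}^{K} -T_i \log T_i \right). $$

Since $\lim_{n \to \infty} \frac{1}{n} H_n = h$ almost surely and $E\Theta < \infty$, we can apply the dominated convergence theorem 
if we check that $\frac{1}{n} H_n$ is bounded. This follows from the standard upper bound for entropy of measures 
on the finite set $\{x \in \mathcal{N} : |x| = n\}$, which has $K^n$ elements, coming from the Jensen inequality: 

$$ H_n = -\sum_{|x|=n} \mu_n(\{x\}) \log \mu_n(\{x\}) = \int_{\{x \in \mathcal{N} : |x| = n\}} \log \frac{1}{\mu_n(\{x\})} \, d\mu_n(x) \le \log \int_{\{x \in \mathcal{N} : |x| = n\}} \frac{1}{\mu_n(\{x\})} \, d\mu_n(x) $$
$$ = \log \sum_{|x|=n} \frac{1}{\mu_n(\{x\})} \mu_n(\{x\}) = \log K^n = n \log K, $$

so $\frac{1}{n} H_n \le \log K$. Now dominated convergence gives 

$$ h = E\left( \sum_{i=1}^{K} -T_i \log T_i \right). $$

Recalling (23), the proof of the theorem is complete. □ 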
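
The Jensen bound $H_n \le n \log K$ used in the proof is easy to check numerically. The following Python sketch is only an illustration: the values of `K` and `n` and the random probability vector `mu_n` are arbitrary assumptions, standing in for the weights $\mu_n(\{x\})$ on the $K^n$ vertices of level $n$.

```python
import math
import random


def entropy(weights):
    """Shannon entropy -sum w log w of a probability vector (natural logarithm)."""
    return -sum(w * math.log(w) for w in weights if w > 0)


rng = random.Random(1)  # fixed seed for reproducibility
K, n = 3, 5             # toy values (assumption)

# An arbitrary probability vector on the K**n vertices of level n.
raw = [rng.random() for _ in range(K ** n)]
total = sum(raw)
mu_n = [r / total for r in raw]

H_n = entropy(mu_n)
bound = n * math.log(K)  # the upper bound log(K**n) from the Jensen inequality
uniform_entropy = entropy([1.0 / K ** n] * (K ** n))  # equality case
```

The bound is attained exactly by the uniform distribution on the level, which is why no sharper deterministic bound is available.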

Remark 5.1. This value can be explicitly calculated, as soon as the weight function is given, since the 
$\tau_i$ variables are the sum of independent, exponentially distributed random variables with parameters 
$(w(j))_{j=0}^{i-1}$. Alternatively, with the function g defined in 



dA 



A=A* 
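
The first recipe in the remark can be sketched numerically in Python. This sketch rests on assumptions that are not restated in this section: that $\tau_i$ is the sum of independent exponentials with rates $w(0), \dots, w(i-1)$, and that the Malthusian parameter $\lambda^*$ is the solution of $\sum_{i=1}^{K} E e^{-\lambda \tau_i} = 1$. Under these assumptions $E e^{-\lambda \tau_i} = \prod_{j<i} \frac{w(j)}{\lambda + w(j)}$ and $E(-T_i \log T_i) = \lambda^* E(\tau_i e^{-\lambda^* \tau_i})$. The function names are hypothetical.

```python
import math


def rho_hat(lam, w):
    """Sum over i = 1..K of E exp(-lam * tau_i), with w = [w(0), ..., w(K-1)]."""
    total, prod = 0.0, 1.0
    for rate in w:
        prod *= rate / (lam + rate)
        total += prod
    return total


def malthusian(w, tol=1e-13):
    """Solve rho_hat(lam, w) = 1 by bisection (rho_hat is decreasing in lam)."""
    lo, hi = 0.0, 1.0
    while rho_hat(hi, w) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rho_hat(mid, w) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def entropy_rate(w):
    """Return (lam_star, h) with h = sum_i lam_star * E(tau_i * exp(-lam_star * tau_i))."""
    lam = malthusian(w)
    h, prod, inv_sum = 0.0, 1.0, 0.0
    for rate in w:
        prod *= rate / (lam + rate)      # E exp(-lam * tau_i)
        inv_sum += 1.0 / (lam + rate)    # minus the logarithmic derivative in lam
        h += lam * prod * inv_sum
    return lam, h


# Toy example (assumption): K = 2 children, constant weights w(0) = w(1) = 1.
lam_star, h = entropy_rate([1.0, 1.0])
```

For this toy weight function the equation for $\lambda^*$ reads $\frac{1}{1+\lambda} + \frac{1}{(1+\lambda)^2} = 1$, so $1 + \lambda^*$ is the golden ratio, and the resulting $h$ stays below $\log K = \log 2$, in line with the entropy bound used in the proof of Theorem 2.5.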



6 Outlook 

The present result is restricted to the $K < \infty$ case, i.e. when a vertex can only have finitely many 
children. This property is used in three places. First, Theorem 2.3 relies on Lemma 3.18, which is a 
very rough estimate working for finite $K$ only. Second, in the proof of Theorem 2.5 we use the fact 
that $\frac{1}{n} H_n$ is bounded - which is also certainly false for $K = \infty$. Third, showing the continuity of the 
density $\pi$ and the transition kernel function $k$ (in Lemmas 3.12 and 4.5) is easier using the fact that 
the sum in (5) is finite. With more care, these could possibly be generalized for the $K = \infty$ case, so 
the main result about the Hausdorff dimension, Theorem 2.4, could be shown in greater generality. 
However, not having the explicit formula of Theorem 2.5 is a serious drawback. We believe that the 
problem can be solved - and the validity of the explicit formula can be shown - for a large class of 
rate functions with $K = \infty$ by a detailed analysis of the transition kernel $P$. Such an analysis could 
be avoided in the present paper by the study of the limiting distribution in Section 4.2. We plan to 
return to that in the future. 



Acknowledgements 

We gratefully thank Balázs Ráth for the simple proof of Lemma 4.2. We are also grateful to an 
anonymous referee for many useful suggestions that helped improve the quality of the paper. A. 
Rudas acknowledges the support of OTKA grant K60708. I. P. Tóth acknowledges the support 
of OTKA grants PD73609 and K71693, and is also grateful to the European Research Council for 
support. 






References 



[1] Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 
286(5439):509-512, 1999. 

[2] J. Berestycki. Multifractal spectra of fragmentation processes. Journal of Statistical Physics, 
113(3):411-430, 2003. 

[3] J. Bertoin. Random fragmentation and coagulation processes. Cambridge Univ Pr, 2006. 

[4] J. D. Biggins. Martingale convergence in the branching random walk. Journal of Applied 
Probability, 14(1):25-37, 1977. 

[5] Béla Bollobás, Oliver Riordan, Joel Spencer, and Gábor Tusnády. The degree sequence of a 
scale-free random graph process. Random Structures Algorithms, 18(3):279-290, 2001. 

[6] Joseph Leo Doob. Stochastic Processes. Wiley, 1953. 

[7] T. Duquesne. Packing and Hausdorff measures of stable trees. Lévy Matters I, pages 93-136, 
2010. 

[8] Thomas Duquesne and Jean-François Le Gall. Probabilistic and fractal aspects of Lévy trees. 
Probability Theory and Related Fields, 131:553-603, 2005. 

[9] Kenneth Falconer. Techniques in Fractal Geometry. Wiley, 1997. 

[10] B. Haas and G. Miermont. The genealogy of self-similar fragmentations with negative index as 
a continuum random tree. Electronic Journal of Probability, 9(paper 4):57, 2004. 

[11] B. Haas and G. Miermont. Scaling limits of Markov branching trees, with applications to 
Galton-Watson and random unordered trees. Arxiv preprint arXiv:1003.3632, 2010. 

[12] B. Haas, G. Miermont, J. Pitman, and M. Winkel. Continuum tree asymptotics of discrete 
fragmentations and applications to phylogenetic models. The Annals of Probability, 36(5):1790-1837, 
2008. 

[13] Peter Jagers. Branching processes with biological applications. Wiley-Interscience [John Wiley 
& Sons], London, 1975. Wiley Series in Probability and Mathematical Statistics - Applied 
Probability and Statistics. 

[14] P. L. Krapivsky and S. Redner. Organization of growing random networks. Phys. Rev. E, 
63(6):066123, May 2001. 

[15] P. L. Krapivsky, S. Redner, and F. Leyvraz. Connectivity of growing random networks. Phys. 
Rev. Lett., 85(21):4629-4632, Nov 2000. 

[16] R. Lyons. A simple path to Biggins' martingale convergence for branching random walk. In K.B. 
Athreya and P. Jagers, editors, Classical and modern branching processes. The IMA volumes in 
mathematics and its applications. Springer, 1997. 

[17] R. Lyons, R. Pemantle, and Y. Peres. Conceptual proofs of L log L criteria for mean behavior of 
branching processes. The Annals of Probability, 23(3):1125-1138, 1995. 

[18] T. F. Móri. On random trees. Studia Sci. Math. Hungar., 39(1-2):143-155, 2002. 

[19] Roberto Oliveira and Joel Spencer. Connectivity transitions in networks with super-linear 
preferential attachment. Internet Math., 2(2):121-163, 2005. 

[20] Anna Rudas and Bálint Tóth. Random tree growth with branching processes - a survey. In B. Bollobás, 
R. Kozma, and D. Miklós, editors, Handbook of Large-Scale Random Networks, volume 18 
of Bolyai Society Mathematical Studies, chapter 4. Springer, 2007. 

[21] Anna Rudas, Bálint Tóth, and Benedek Valkó. Random trees and general branching processes. 
Random Struct. Algorithms, 31(2):186-202, 2007. 






